ACIRD: Intelligent Internet Documents Organization and Retrieval
Abstract
In this paper, we present ACIRD, an intelligent Internet information system that uses machine learning techniques to organize and retrieve Internet Web documents. ACIRD consists of three parts: knowledge acquisition, a document classifier, and a two-phase search engine. The knowledge acquisition component automatically learns classification knowledge from classified Internet Web documents, and the classifier applies this knowledge to assign newly collected Internet Web documents to one or more classes in a class hierarchy. The experiments show that ACIRD performs as well as or better than human experts in both knowledge extraction and document classification. Based on the learned classification knowledge and the given class hierarchy, the ACIRD two-phase search engine presents hierarchically navigable, structured results to users instead of conventional flat ranked results, which greatly helps users discover information among diverse Internet documents.
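The pipeline described in the abstract (learn classification knowledge from classified documents, assign new documents to one or more classes, then present search results grouped by class) can be illustrated with a minimal Python sketch. This is not ACIRD's actual algorithm: the term-weighting scheme, the averaging score, the `threshold` parameter, the function names, and the slash-separated class names are all assumptions made for illustration only.

```python
from collections import Counter, defaultdict


def learn_class_terms(labeled_docs):
    """Learn {class: {term: weight}} from (text, class) pairs.

    Assumed weighting: the fraction of a class's documents that contain the term.
    """
    doc_counts = Counter()
    term_counts = defaultdict(Counter)
    for text, cls in labeled_docs:
        doc_counts[cls] += 1
        for term in set(text.lower().split()):
            term_counts[cls][term] += 1
    return {
        cls: {term: n / doc_counts[cls] for term, n in terms.items()}
        for cls, terms in term_counts.items()
    }


def classify(text, knowledge, threshold=0.5):
    """Return every class whose average term weight reaches the threshold."""
    terms = set(text.lower().split())
    if not terms:
        return []
    scores = {
        cls: sum(weights.get(t, 0.0) for t in terms) / len(terms)
        for cls, weights in knowledge.items()
    }
    return sorted(cls for cls, score in scores.items() if score >= threshold)


def grouped_search(query, documents, knowledge, threshold=0.5):
    """Group documents containing all query terms under their assigned classes."""
    query_terms = set(query.lower().split())
    groups = defaultdict(list)
    for doc in documents:
        if query_terms <= set(doc.lower().split()):
            for cls in classify(doc, knowledge, threshold) or ["unclassified"]:
                groups[cls].append(doc)
    return dict(groups)


# Hypothetical training data; class names mimic paths in a class hierarchy.
training = [
    ("python code compiler", "Computers/Programming"),
    ("python code debugger", "Computers/Programming"),
    ("python snake reptile", "Science/Biology"),
    ("snake reptile habitat", "Science/Biology"),
]
knowledge = learn_class_terms(training)
results = grouped_search(
    "python", ["python code", "python snake", "reptile habitat"], knowledge
)
for cls in sorted(results):
    print(cls, results[cls])
```

In this sketch a document may land in more than one class, so the same hit can appear under several branches of the grouped result rather than once in a flat ranked list.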
Similar resources
ACIRD: Intelligent Internet Document Organization and Retrieval
This paper presents an intelligent Internet information system, Automatic Classifier for the Internet Resource Discovery (ACIRD), which uses machine learning techniques to organize and retrieve Internet documents. ACIRD consists of a knowledge acquisition process, document classifier and two-phase search engine. The knowledge acquisition process of ACIRD automatically learns classification know...
ACIRD: An Intelligent Internet Information System Based on Data Mining (Extended Abstract)
The explosive growth of the Internet dramatically changes the way of working and living that the Internet becomes a major source of information. However, the excessive information on the Internet creates the information overflow problem. As a result, information retrieval (IR) systems (or search engines) come to help the Internet users to alleviate the problem. The conventional IR systems are d...
IDSIS: Intelligent Document Semantic Indexing System
Zhongzhi Shi, Bin Wu, Qing He, Xiujun Gong, Shaohui Liu, Yi Zheng. Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences. Abstract: With rapid growth of the Internet, how to get information from this huge information space becomes an even important problem. In this paper, An Intelligence Document Semantic Indexi...
A Framework of Personalized Intelligent Document and Information Management System
A framework for building a personal information system is introduced. A personal information system can not only help users managing their personal documents, but also act as an intelligent agent to search and collect useful documents from existing information system including Internet on behalf of users. The system employs a flexible dual-model approach. A document type hierarchy describes the...
Multiple Word senses and Information Retrieval: An application using thesaurally derived Lexical Chains
The primary objective of this work is to Improve Internet based Information Retrieval. Currently Internet search engines retrieve a heterogeneous collection of documents of varied quality. Whilst many are “relevant” to the search terms used, many others coincidentally contain a matched word. They do not, in other words, have meaningful content. An enabling objective is to develop a "weakly" int...
Publication date: 2002